Initialization of Self-Organizing Maps: Principal Components Versus Random Initialization. A Case Study
Abstract
The performance of the Self-Organizing Map (SOM) algorithm depends on the initial weights of the map. Initialization methods can be broadly classified into random and data-analysis-based approaches. In this paper, the performance of the random initialization (RI) approach is compared with that of principal component initialization (PCI), in which the initial map weights are chosen from the space spanned by the principal components. Performance is evaluated by the fraction of variance unexplained (FVU). Datasets were classified as quasi-linear or non-linear, and it was observed that RI performed better for non-linear datasets; however, the performance of the PCI approach remains inconclusive for quasi-linear datasets.
Introduction
The Self-Organizing Map (SOM) can be considered a non-linear generalization of principal component analysis [14] and has found many applications in data exploration, especially in data visualization, vector quantization and dimension reduction. Inspired by biological neural networks, it is a type of artificial neural network that uses an unsupervised learning algorithm with the additional property that it preserves the topological mapping from input space to output space, making it a useful tool for visualizing high-dimensional data in a lower dimension. Originally developed for visualizing the distribution of metric vectors [12], SOM found early application in speech recognition. However, as with clustering algorithms, the quality of learning of SOM is greatly influenced by the initial conditions: the initial weights of the map, the neighbourhood function, the learning rate, the sequence of training vectors and the number of iterations [1][12][11]. Several initialization approaches have been developed and can be broadly grouped into two classes: random initialization and data-analysis-based initialization [1].
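The fraction of variance unexplained (FVU) is named as the performance measure but not written out in this text. A common definition, assumed here rather than quoted from the paper, divides the summed squared distances from the data points to the approximating map by the total squared deviation of the data from its mean. The symbols $x_i$, $\bar{x}$, $n$ and $\mathcal{S}$ are introduced only for this illustration:

```latex
\[
\mathrm{FVU}
  = \frac{\sum_{i=1}^{n} \operatorname{dist}^{2}(x_i, \mathcal{S})}
         {\sum_{i=1}^{n} \lVert x_i - \bar{x} \rVert^{2}},
\qquad
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
\]
```

Here $\mathcal{S}$ stands for the trained map (for example its set of nodes, or the broken line through them), and $\operatorname{dist}(x_i, \mathcal{S})$ is the distance from $x_i$ to the nearest point of $\mathcal{S}$. Under this definition, smaller values indicate a better approximation of the data.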
Because many initial configurations are possible with the random approach, several attempts are usually made and the best initial configuration is adopted. In the data-analysis-based approach, statistical data analysis and data classification methods are used to determine the initial configuration; a popular method is to select the initial weights from the space spanned by the linear principal components (the first eigenvectors, corresponding to the largest eigenvalues of the empirical covariance matrix). A modification of the PCA approach was made by [1], and over the years other initialization methods have been proposed; an example is given by [4].
In this paper we consider the performance, in terms of the quality of learning of the SOM, of the random initialization (RI) method (in which the initial weights are taken from the sample data) and the principal component initialization (PCI) method. The quality of learning is measured by the fraction of variance unexplained [8]. To ensure an exhaustive study, synthetic data sets distributed along various shapes in only 2 dimensions are considered, and the map is 1-dimensional. The 1-dimensional SOM is important, for example, for the approximation of principal curves. The experiments were performed using the PCA, SOM and GSOM applet available online [8].
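The two initialization schemes compared here can be sketched for a 1-dimensional map on 2-dimensional data. This is a minimal illustration with NumPy, not the applet's implementation: the function names `pci_init` and `random_init`, the parameter `n_std`, and the choice to spread the PCI nodes evenly over plus or minus one standard deviation along the first principal axis are all assumptions made for the example.

```python
import numpy as np


def pci_init(data, n_nodes, n_std=1.0):
    """Place nodes evenly along the first principal axis of `data`.

    Assumption: nodes span +/- n_std standard deviations around the mean;
    the source does not state the spacing rule.
    """
    mean = data.mean(axis=0)
    # Empirical covariance matrix of the data (columns are variables).
    cov = np.cov(data, rowvar=False)
    # eigh returns eigenvalues in ascending order, so the last is the largest.
    eigvals, eigvecs = np.linalg.eigh(cov)
    first_pc = eigvecs[:, -1]
    sigma = np.sqrt(eigvals[-1])
    # Coordinates of the nodes along the first principal axis.
    t = np.linspace(-n_std * sigma, n_std * sigma, n_nodes)
    return mean + np.outer(t, first_pc)


def random_init(data, n_nodes, rng):
    """Pick initial node weights as distinct points of the sample data."""
    idx = rng.choice(len(data), size=n_nodes, replace=False)
    return data[idx]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical non-linear data set: noisy points along a half circle.
    angle = rng.uniform(0.0, np.pi, size=300)
    data = np.column_stack([np.cos(angle), np.sin(angle)])
    data += rng.normal(scale=0.05, size=data.shape)

    print("PCI nodes:\n", pci_init(data, n_nodes=5))
    print("RI nodes:\n", random_init(data, n_nodes=5, rng=rng))
```

With PCI the initial nodes always lie on one straight line through the data mean, while RI gives a different starting configuration for each draw, which is why several random attempts are usually made.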
Similar resources
SOM: Stochastic initialization versus principal components
Selection of a good initial approximation is a well known problem for all iterative methods of data approximation, from k-means to Self-Organizing Maps (SOM) and manifold learning. The quality of the resulting data approximation depends on the initial approximation. Principal components are popular as an initial approximation for many methods of nonlinear dimensionality reduction because its c...
Comparing Face Detection and Recognition Techniques
This paper implements and compares different techniques for face detection and recognition. The first task, face detection, is finding where the face is located in an image; the second, face recognition, is identifying the person. We study three techniques in this paper: Face detection using self organizing map (SOM), Face recognition by projection and nearest neighbor and Face recognition using...
Self-organizing Maps as Substitutes for K-Means Clustering
One of the most widely used clustering techniques in GISc problems is the k-means algorithm. One of the most important issues in the correct use of k-means is the initialization procedure that ultimately determines which part of the solution space will be searched. In this paper we briefly review different initialization procedures, and propose Kohonen’s Self-Organizing Maps as the most con...
Resistance Spot Welding Process Identification and Initialization Based on Self-Organizing Maps
Resistance spot welding is used to join two or more metal objects together, and the technique is in widespread use in, for example, the automotive and electrical industries. This paper discusses both the identification of different spot welding processes and the process initialization parameters leading to high-quality welding joints. In this research, self-organizing maps (SOMs) were used, and ...
Combination of Reinforcement Learning and Dynamic Self Organizing Map for Robot Arm Control
This paper shows that a system with a two-link arm can acquire arm-reaching movement toward a target object by combining reinforcement learning with dynamic self-organizing maps. The proposed model represents the state and action spaces of reinforcement learning with dynamic self-organizing maps. Because these spaces are continuous, the proposed model uses two dynamic self-organizing maps (DSOM) to e...
Journal title:
- CoRR
Volume: abs/1210.5873
Publication date: 2012